Analysis of Stopping Active Learning based on Stabilizing Predictions
Abstract
Within the natural language processing (NLP) community, active learning has been widely investigated and applied in order to alleviate the annotation bottleneck faced by developers of new NLP systems and technologies. This paper presents the first theoretical analysis of stopping active learning based on stabilizing predictions (SP). The analysis has revealed three elements that are central to the success of the SP method: (1) bounds on Cohen’s Kappa agreement between successively trained models impose bounds on differences in F-measure performance of the models; (2) since the stop set does not have to be labeled, it can be made large in practice, helping to guarantee that the results transfer to previously unseen streams of examples at test/application time; and (3) good (low variance) sample estimates of Kappa between successive models can be obtained. Proofs of relationships between the level of Kappa agreement and the difference in performance between consecutive models are presented. Specifically, if the Kappa agreement between two models exceeds a threshold T (where T > 0), then the difference in F-measure performance between those models is bounded above by $4(1-T)/T$ in all cases. If precision of the positive conjunction of the models is assumed to be p, then the bound can be tightened to $4(1-T)/((p+1)T)$.
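The two quantities the abstract relates, Kappa agreement between successive models and the resulting bound on their F-measure difference, can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names `cohen_kappa` and `f_measure_difference_bound` are hypothetical, predictions are assumed to be binary (0/1) labels on an unlabeled stop set, and returning 1.0 when both models predict a single identical class everywhere is a convention chosen here.

```python
def cohen_kappa(preds_a, preds_b):
    """Cohen's Kappa between two binary (0/1) prediction sequences.

    Both sequences are the predictions of two successively trained
    models on the same stop set; no gold labels are needed.
    """
    if len(preds_a) != len(preds_b) or len(preds_a) == 0:
        raise ValueError("prediction lists must be non-empty and equal in length")
    n = len(preds_a)
    # Observed agreement: fraction of stop-set examples labeled identically.
    observed = sum(a == b for a, b in zip(preds_a, preds_b)) / n
    # Chance agreement from each model's rate of positive predictions.
    pos_a = sum(preds_a) / n
    pos_b = sum(preds_b) / n
    expected = pos_a * pos_b + (1 - pos_a) * (1 - pos_b)
    if expected == 1.0:
        # Assumption: both models predict the same single class everywhere,
        # so Kappa is undefined (0/0); treat it as perfect agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


def f_measure_difference_bound(threshold, precision=None):
    """Upper bound on the F-measure difference stated in the abstract.

    Without a precision assumption: 4(1 - T) / T.
    With precision p of the positive conjunction: 4(1 - T) / ((p + 1) T).
    """
    if not 0 < threshold <= 1:
        raise ValueError("threshold T must satisfy 0 < T <= 1")
    if precision is None:
        return 4 * (1 - threshold) / threshold
    return 4 * (1 - threshold) / ((precision + 1) * threshold)


# Hypothetical predictions of two consecutive models on a 10-example stop set.
previous_model = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
current_model = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
kappa = cohen_kappa(previous_model, current_model)
print("Kappa:", round(kappa, 4))
print("Bound at T = 0.99:", round(f_measure_difference_bound(0.99), 4))
```

A threshold close to 1 keeps the bound small, and a higher assumed precision p tightens it further, which matches the two formulas in the abstract.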
Similar resources
A Method for Stopping Active Learning Based on Stabilizing Predictions and the Need for User-Adjustable Stopping
A survey of existing methods for stopping active learning (AL) reveals the needs for methods that are: more widely applicable; more aggressive in saving annotations; and more stable across changing datasets. A new method for stopping AL based on stabilizing predictions is presented that addresses these needs. Furthermore, stopping methods are required to handle a broad range of different annota...
Deciding when to stop: Efficient stopping of active learning guided drug-target prediction
Active learning has been shown to reduce the number of experiments needed to obtain high-confidence drug-target predictions. However, in order to actually save experiments using active learning, it is crucial to have a method to evaluate the quality of the current prediction and decide when to stop the experimentation process. Only by applying reliable stopping criteria to active learning, time and c...
Stopping Criteria for Active Learning of Named Entity Recognition
Active learning is a proven method for reducing the cost of creating the training sets that are necessary for statistical NLP. However, there has been little work on stopping criteria for active learning. An operational stopping criterion is necessary to be able to use active learning in NLP applications. We investigate three different stopping criteria for active learning of named entity recog...
Transductive Confidence Machine for Active Learning
This paper describes a novel active learning strategy using universal p-value measures of confidence based on algorithmic randomness, and transductive inference. The early stopping criterion for active learning is based on the bias-variance tradeoff for classification. This corresponds to that learning instance when the boundary bias becomes positive, and requires one to switch from active to ra...
Investigating the Causes of Divorce through Narrative Analysis in Yazd City and Designing a Prerequisite Education based on the Causes of Divorce using a Hidden Learning Approach on the basis of Family, School, and Student
Introduction: Today, divorce is a well-known and dangerous social phenomenon that disintegrates families and corrupts the society. Therefore, this study aimed to investigate the causes of divorce through narrative analysis in Yazd City and to design a prerequisite education based on the causes of divorce using a hidden learning approach on the basis of family, school, and student approach. Met...
Publication date: 2013